Load required package:
library(marmap)
Import bathymetry data from NOAA:
b = getNOAA.bathy(lon1 = 115, lon2 = 140, lat1 = 35, lat2 = 20, resolution = 1)
## Querying NOAA database ...
## This may take seconds to minutes, depending on grid size
## Building bathy matrix ...
Import dataframe with the latitude and longitude of sampling points from CTD (in decimal degrees):
pts <- read.csv("~/Desktop/MiraiFilters/Mirai_CTD_lat_long.csv")
pts$Station<- as.character(pts$Station)
str(pts)
## 'data.frame': 14 obs. of 3 variables:
## $ Lon : num 126 128 127 126 125 ...
## $ Lat : num 26.3 25.4 25.9 26.5 25.9 ...
## $ Station: chr "2" "3" "4" "5" ...
Make dataframe with map labels and their lat/long:
Clon = c(127.5,131.25, 121.6, 121.03, 120.8, 126)
Clat= c(26, 31.93, 31.1, 23, 27.2, 22)
CLabels= c("Okinawa", "Japan", "China", "Taiwan", "East China Sea", "Philippine Sea")
Countries = cbind.data.frame(Clon, Clat, CLabels)
str(Countries)
## 'data.frame': 6 obs. of 3 variables:
## $ Clon : num 128 131 122 121 121 ...
## $ Clat : num 26 31.9 31.1 23 27.2 ...
## $ CLabels: Factor w/ 6 levels "China","East China Sea",..: 4 3 1 6 2 5
Make a dataframe with vent sites:
vents <- read.csv("~/Desktop/MiraiFilters/VentSites.csv")
str(vents)
## 'data.frame': 25 obs. of 3 variables:
## $ Long: num 122 122 123 123 123 ...
## $ Lat : num 24.8 24.8 25.1 24.8 24.9 ...
## $ Name: Factor w/ 25 levels "Akuseki-jima",..: 11 12 21 24 4 5 20 22 23 8 ...
Define colors to use in plot:
blues <- c("lightsteelblue4", "lightsteelblue3", "lightsteelblue2", "lightsteelblue1")
greys <- c(grey(0.6), grey(0.93), grey(0.99))
Newblues <- colorRampPalette(c("red","purple","blue",
"cadetblue1","white"))
blues2<-c("blue",
"cadetblue1", "lightsteelblue2")
Make plot one layer at a time:
tiff("MiraiFiltersMap.tiff", height= 8, width =8, units = 'in', res=300) # uncomment to save plot to file
plot(b, image = TRUE, land = TRUE, xlim=c(120,135), ylim=c(22, 32), lwd = 0.03, bpal = list(c(0, max(b), grey(.7), grey(.9), grey(.95)),
c(min(b), 0, "darkblue", "lightblue")))
#plot(b, image = TRUE, land = TRUE, xlim=c(120,135), ylim=c(22, 32), lwd = 0.03, bpal = Newblues(100), cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
#bathymetry colors
plot(b, lwd = 0.8, deep = 0, shallow = 0, step = 0, add = TRUE) # highlight coastline with solid black line
plot(b, deep=-200, shallow=-200, lwd = 0.4, drawlabels=T, add=T) # Add -200m isobath
plot(b, deep=-1000, shallow=-1000, lwd = 0.4, drawlabels=T, add=T) # Add -1000m isobath
plot(b, deep=-2000, shallow=-2000, lwd = 0.4, drawlabels=T, add=T) # Add -2000m isobath
plot(b, deep=-3000, shallow=-3000, lwd = 0.4, drawlabels=T, add=T) # Add -3000m isobath
points(pts, pch = 16, col = 'red') # Add sampling points
text(pts, labels=pts$Station, adj= 1.5, cex=1.2, size = 4) # Add sampling station numbers
text(Countries, labels=Countries$CLabels, cex= 1.7, adj = -0.2) # Add map labels
points(vents, pch = 17, col = 'blue', size = 3)
dev.off() #uncomment to save plot to file
## quartz_off_screen
## 2
Sample collection, DNA extraction, PCR, & sequencing
Seawater collected from the Deep Chlorophyll Maximum (DCM), mid-water column, and near-bottom was collected in 10 L Niskin bottles at each sampling station and surface seawater was collected by bucket. Five Liter replicates from each depth (4.5 from surface) were sequentially filtered through 10.0 and 0.2 um pore-size Polytetrafluoroethylene (PTFE) filters. Filters were flash frozen in liquid nitrogen and stored at -80 C until DNA extraction. DNA was extracted following the manufacturers protocols for the Qiagen/MoBio DNeasy PowerWater Kit, including the optional heating step to lyse cells. Amplicon PCR and sequencing library preparation followed the procedures described in the Illumina 16S Metagenomic Sequencing Library Preparation manual modified to include universal eukaryotic primers for the v4 region of the 18S rRNA gene (F: Stoeck et al. 2010, R: Brisbin et al. 2018) and amplicon PCR conditions most appropriate for these primers (annealing at 58C). The 220 samples were separated into 4 sequencing runs on the Illumina MiSeq sequencing platform with v3 chemistry for 300x300-bp paired-end sequencing.
Bioinformatic processing with Qiime2
Demultiplexed paired-end sequences provided by the OIST sequencing center were imported to Qiime2 (v2018.11) software (Bolyen et al. 2018). The Divisive Amplicon Denoising Algorithm (DADA) was implemented with the DADA2 plug-in for Qiime2 for quality filtering, chimera removal, and feature table construction (Callahan et al. 2015) separately for each sequencing run before merging the four resulting feature tables. Taxonomy was assigned to Amplicon Sequence Variants (ASVs) in the feature table by training a classifier on the Protist Ribosomal Reference (PR2) database to classify ASVs (Guillo et al. 2012). The feature table and assigned taxonomy were then exported from Qiime2 for further analysis.
Below is an example Qiime2 workflow for 1 sequencing run:
Activate an anaconda environment for qiime2: source activate qiime2-2018.11
Import sequencing data (all sequences for eacg sequencing run must be in their own directory): qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ./seqs/seqrun1/ --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path run1_demux-paired-end.qza
Visualize squencing results (use these graphs to decide what length to trim sequences) qiime demux summarize --i-data run1_demux-paired-end.qza --o-visualization run1_demux.qzv
Run the DADA2 algorithm: qiime dada2 denoise-paired --i-demultiplexed-seqs run1_demux-paired-end.qza --output-dir ./dd2run1 --o-representative-sequences run1_rep-seqs --p-trim-left-f 10 --p-trim-left-r 10 --p-trunc-len-f 260 --p-trunc-len-r 250 --p-n-threads 3
Check results: qiime feature-table summarize --i-table ./dd2run1/table.qza --o-visualization ./dd2run1/table.qzv --m-sample-metadata-file ~/desktop/MiraiFilters/sampleMaptxt.txt
qiime metadata tabulate --m-input-file dd2run1/denoising_stats.qza --o-visualization dd2run1/denoising_stats.qzv
Merge feature tables from multiple sequencing runs: qiime feature-table merge --i-tables ./dd2run1/table.qza --i-tables ./dd2run2/table.qza --i-tables ./dd2run3/table.qza --i-tables ./dd2run4/table.qza --o-merged-table mergedtable.qza
qiime feature-table merge-seqs --i-data run1_rep-seqs.qza --i-data run2_rep-seqs.qza --i-data run3_rep-seqs.qza --i-data run4_rep-seqs.qza --o-merged-data merged-rep-seqs.qza
Import taxonomy database: qiime tools import --type 'FeatureData[Sequence]' --input-path pr2_version_4.11.1_mothur.fasta --output-path pr2.qza
qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path pr2_version_4.11.1_mothur.tax -output-path pr2_tax.qza
Select v4 region from PR2 sequences: qiime feature-classifier extract-reads --i-sequences pr2.qza --p-f-primer CCAGCASCYGCGGTAATTCC --p-r-primer ACTTTCGTTCTTGATYRA --o-reads pr2_v4.qza
Train the classifier: qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads pr2_v4.qza --i-reference-taxonomy pr2_tax.qza --o-classifier pr2_v4_classifier.qza
Classify: qiime feature-classifier classify-sklearn --i-classifier pr2_v4_classifier.qza --i-reads merged-rep-seqs.qza --o-classification taxonomy.qza
qiime metadata tabulate --m-input-file taxonomy.qza --o-visualization taxonomy.qzv
Export the .tsv taxonomy file from taxonomy.qzv for use in following steps.
Load packages:
library(tidyverse)
library(qiime2R)
library('phyloseq')
library('vegan')
library('ggplot2')
library(RColorBrewer)
library(DESeq2)
library(reshape)
library("wesanderson")
library("gridExtra")
library("metagMisc")
library("wesanderson")
library("breakaway")
library("CoDaSeq")
library("ggbiplot")
Set default colors:
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
P <- brewer.pal(12, "Paired")
S2 <- brewer.pal(8, "Set2")
S1 <- brewer.pal(8, "Set1")
D2 <- brewer.pal(8, "Dark2")
colors<- c(P, S2, S1,D2)
colors2 <- rep(colors, 6)
ap<- c("#F1B6A1", "#D4A52A", "#E3E5DB", "#A5CFCC", "#0E899F", "#A83860", "#ED91BC", "#DB5339", "#F58851", "#42465C", "#1E479A", "#F7CDA4", "#CF529C", "#11638C")
Convert Qiime2 artifacts to phyloseq objects:
phyloseq<-qza_to_phyloseq(features="MiraiSeqs/mergedtable.qza")
## Warning in read_qza(features): Artifact was not generated with Qiime2
## 2018.4, if data is not successfully imported, please report here
## github.com/jbisanz/qiime2R/issues
Import sample data:
metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$Station <- as.character(metatable$Station)
metatable$filter <- as.character(metatable$filter)
metatable$VentANAHTM<-as.factor(metatable$VentANAHTM)
META<- sample_data(metatable)
str(META)
## 'data.frame': 219 obs. of 11 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 11
## .. ..$ : Factor w/ 219 levels "A6-F173-DNA",..: 3 6 9 12 96 97 115 118 129 134 ...
## .. ..$ : chr "2" "2" "2" "2" ...
## .. ..$ : Factor w/ 4 levels "Bottom","Mid",..: 4 4 4 4 1 1 2 2 3 3 ...
## .. ..$ : int 0 0 0 0 1000 1000 700 700 92 92 ...
## .. ..$ : chr "10" "2" "10" "2" ...
## .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : Factor w/ 4 levels "KG","MOT","NOT",..: 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ : Factor w/ 3 levels "KG","NOT","SOT": 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ : Factor w/ 3 levels "n","p","y": 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : Factor w/ 4 levels "KG","NOT","O",..: 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
## ..@ names : chr "SampleID" "Station" "Depth" "m" ...
## ..@ row.names: chr "A8-F17-DNA" "B8-F18-DNA" "C8-F19-DNA" "D8-F20-DNA" ...
## ..@ .S3Class : chr "data.frame"
Load PR2 taxonomy:
taxonomy <- read.csv("MiraiSeqs/taxonomy.csv", stringsAsFactors = FALSE)
names(taxonomy) <- c("row", "tax", "Confidence")
row.names(taxonomy) <-taxonomy[[1]]
taxonomy <- taxonomy[,(-1)]
taxonomy <- separate(taxonomy, tax, c("D0","D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "D10", "D11", "D12", "D13", "D14"), sep = ";", fill = "right")
taxonomy <- taxonomy[,c(1:8)]
taxmat <- as.matrix(taxonomy)
TAX = tax_table(taxmat)
str(TAX)
## Formal class 'taxonomyTable' [package "phyloseq"] with 1 slot
## ..@ .Data: chr [1:23115, 1:8] "Eukaryota" "Eukaryota" "Eukaryota" "Eukaryota" ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:23115] "000a220fe8edd771dd82f6915627ef3e" "000c79760dd16d6692018deb8ea1b164" "000d9eaa21eaff08f34fb1a1a87601d5" "000e6f109e01181b2b1eda5d6c99319c" ...
## .. .. ..$ : chr [1:8] "D0" "D1" "D2" "D3" ...
Add taxonomy to phyloseq object:
ps = merge_phyloseq(phyloseq, TAX, META)
prevdf = apply(X = otu_table(ps),
MARGIN = ifelse(taxa_are_rows(ps), yes = 1, no = 2),
FUN = function(x){sum(x > 0)})
prevdf = data.frame(Prevalence = prevdf,
TotalAbundance = taxa_sums(ps),
tax_table(ps))
#plyr::ddply(prevdf, "D2", function(df1){cbind(mean(df1$Prevalence),sum(df1$Prevalence))})
Prevalence plot:
prevplot1<-ggplot(prevdf, aes(TotalAbundance, Prevalence / nsamples(ps),color=D2)) +
geom_hline(yintercept = 0.05, alpha = 0.5, linetype = 2) + geom_point(size = 2, alpha = 0.7) +
theme_bw()+
scale_x_log10() + xlab("Total Abundance") + ylab("Prevalence [Frac. Samples]") +
facet_wrap(~D1) + theme(legend.position="none")
prevplot1
don’t apply prevalence filter
OTUs <- data.frame(otu_table(ps))
total
OTUsRS<- OTUs
OTUsRS$RowSum <- rowSums(OTUsRS)
OTUsRSnoZero <- OTUsRS$RowSum!=0
sum(OTUsRSnoZero)
## [1] 23115
per sample is this number a zero? true or false –> col sums = richness
OTUs0 <- OTUs!=0
csums <- colSums(OTUs0)
csumdf <- as.data.frame(csums)
max(csumdf$csums) #1906
## [1] 1906
min(csumdf$csums) #49
## [1] 49
mean(csumdf$csums) #735
## [1] 734.789
alpha diversity plot
p<-plot_richness(ps, x = "Depth", measures = c("Observed", "Shannon"), color = "filter") + geom_boxplot() + theme_bw() + scale_color_manual(values = c("#999999", "#0072B2")) + theme(text = element_text(size=14))
p$layers <- p$layers[-1]
p
#ggsave("alphadiversity_Obs_Shann.pdf", width = 7, height =4)
ba <- breakaway(ps)
badf<- summary(ba) %>%
add_column("SampleID" = ps %>% otu_table %>% sample_names)
badf<- merge(badf, metatable, by = "SampleID")
baPlot <- ggplot(badf, aes(x=Depth, y=estimate, fill=filter)) + geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#999999", "#0072B2"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Breakaway Alpha Diversity Estimate")
baPlot
#ggsave("breakaway_4depths.png", width = 6, height = 4)
bt <- betta(summary(ba)$estimate,
summary(ba)$error,
make_design_matrix(ps, "Depth"))
bt$table
## Estimates Standard Errors p-values
## (Intercept) 666.49293 20.47393 0.000
## predictorsMid -29.21947 41.39565 0.480
## predictorsSCM 329.00654 41.44555 0.000
## predictorsSurface -18.51460 40.34140 0.646
bottom
bottomraw<- subset_samples(ps, Depth=="Bottom")
bba <- breakaway(bottomraw)
bbadf<- summary(bba) %>%
add_column("SampleID" = bottomraw %>% otu_table %>% sample_names)
bbadf<- merge(bbadf, metatable, by = "SampleID")
bbadf$Station_f <- factor(bbadf$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))
bbaPlot <- ggplot(bbadf, aes(x=Station_f, y=estimate, fill=filter)) + geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#999999", "#0072B2"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Breakaway Alpha Diversity Estimate") +xlab("Station")
bbaPlot
#ggsave("breakaway_bottom.png", width = 6, height = 4)
bt <- betta(summary(bba)$estimate,
summary(bba)$error,
make_design_matrix(bottomraw, "VentPlume"))
bt$table
## Estimates Standard Errors p-values
## (Intercept) 660.6267 31.74286 0.000
## predictorsp -139.8353 117.67309 0.235
## predictorsy 110.9332 83.20845 0.182
bt <- betta(summary(bba)$estimate,
summary(bba)$error,
make_design_matrix(bottomraw, "VentANAHTM"))
bt$table
## Estimates Standard Errors p-values
## (Intercept) 648.7259 31.74286 0.00
## predictorsy 122.8341 83.20845 0.14
#table(tax_table(psN)[, "D1"], exclude = NULL)
ps<-subset_taxa(ps, D1 != "")
ps<-subset_taxa(ps, D1 != "NA")
ps<-subset_taxa(ps, D2 != "Metazoa")
OTUs <- data.frame(otu_table(ps))
is this number a zero? true or false –> col sums = richness
OTUs0 <- OTUs!=0
csums <- colSums(OTUs0)
csumdf <- as.data.frame(csums)
max(csumdf$csums) #1800
## [1] 1800
min(csumdf$csums) #45
## [1] 45
mean(csumdf$csums) #688
## [1] 688.8716
Remove incorrectly labeled samples:
mixedup <- c("F43", "F44", "F45", "F46", "F47", "F48", "F84")
'%ni%' <- Negate('%in%')
ps<- subset_samples(ps, SampleID %ni% mixedup)
OTU4clr<- data.frame(t(data.frame(otu_table(ps))))
row.names(OTU4clr) <- gsub("\\.", "", row.names(OTU4clr))
OTUs.clr <- codaSeq.clr(OTU4clr + 0.5, samples.by.row=TRUE)
OTU2 <- otu_table(as.matrix(OTUs.clr), taxa_are_rows = FALSE)
row.names(metatable) <- gsub("-", "", row.names(metatable))
META2<- sample_data(metatable)
psCLR <- phyloseq(OTU2,TAX,META2)
ordu = ordinate(psCLR, "PCoA", "euclidean")
p<-plot_ordination(psCLR, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=ap)+ geom_point(size=3) +theme(text = element_text(size=16))
p
physeqSurf<- subset_samples(psCLR, Depth == "Surface" | Depth =="SCM")
ordu = ordinate(physeqSurf, "PCoA", "euclidean")
p<-plot_ordination(physeqSurf, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3) +ggtitle("Surface &SCM - CLR/Euclidean")
p
vegan PERMANOVA - CLR
are filter differences statistically significant?
set.seed(1) #makes results reproducible, otherwise, slightly different everytime you run the test
OTUs <- data.frame(otu_table(physeqSurf))
# filter sample data to include ONLY the samples included in this analysis. Otherwise, adonis will give an error.
meta <- metatable[row.names(metatable) %in% row.names(OTUs),]
adonis(vegdist(OTUs, method = "euclidean")~ filter*Depth , data = meta)
##
## Call:
## adonis(formula = vegdist(OTUs, method = "euclidean") ~ filter * Depth, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## filter 1 3060 3060.2 0.90671 0.00852 0.565
## Depth 1 3554 3553.5 1.05287 0.00989 0.343
## filter:Depth 1 4897 4896.9 1.45089 0.01363 0.070 .
## Residuals 103 347634 3375.1 0.96795
## Total 106 359145 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The strongest determinant of clustering seems to be depth. To better observe clustering within depth groups, subset by depth and then filter type.
physeqSurf<- subset_samples(psCLR, Depth == "Surface" & filter == "10")
OTUsSurf10 <- data.frame(otu_table(physeqSurf))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSurf10),]
Surface_10<- prcomp(OTUsSurf10)
plot(Surface_10, type="lines")
PCA_Surf10<-ggbiplot(Surface_10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Surface 10.0")
PCA_Surf10
#ggsave("PCA_Surf10.png")
set.seed(1)
adonis(vegdist(OTUsSurf10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSurf10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 5547 2773.3 1.7617 0.12353 0.005 **
## Residuals 25 39355 1574.2 0.87647
## Total 27 44902 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsSurf10, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.12509963 2.5737713 1;18 0.002 0.0060
## 2 KG.SOT 0.06560932 0.9830262 1;14 0.455 0.4550
## 3 NOT.SOT 0.08497832 1.6716649 1;18 0.007 0.0105
physeqSurf2<- subset_samples(psCLR, Depth == "Surface" & filter == "2")
#tiff("variance_surface2.tiff", height= 8, width =8, units = 'in', res=70) # uncomment to save plot to file
OTUsSurf2 <- data.frame(otu_table(physeqSurf2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSurf2),]
surface2 <- prcomp(OTUsSurf2)
plot(surface2, type="lines")
#dev.off()
PCA_Surf2<-ggbiplot(surface2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Surface 0.2")
PCA_Surf2
#ggsave("PCA_Surf2.png")
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSurf2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSurf2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 8969 4484.5 2.6743 0.18224 0.001 ***
## Residuals 24 40246 1676.9 0.81776
## Total 26 49215 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsSurf2, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.17442004 3.802855 1;18 0.001 0.0015
## 2 KG.SOT 0.09141829 1.308014 1;13 0.151 0.1510
## 3 NOT.SOT 0.13979305 2.762686 1;17 0.001 0.0015
physeqSCM<- subset_samples(psCLR, Depth == "SCM")
ordu = ordinate(physeqSCM, "PCoA", "euclidean")
p<-plot_ordination(physeqSCM, ordu, color="Station", shape="filter")+theme_bw() +scale_color_manual(values=ap)+ geom_point(size=3)
p
Again, strong clustering by filter type.
physeqSCM10<- subset_samples(psCLR, Depth == "SCM" & filter =="10")
OTUsSCM10 <- data.frame(otu_table(physeqSCM10))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSCM10),]
SCM10 <- prcomp(OTUsSCM10)
plot(SCM10, type="lines")
PCA_SCM10<-ggbiplot(SCM10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("SCM 10.0")
PCA_SCM10
#ggsave("PCA_SCM10.png")
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSCM10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSCM10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 10020 5009.8 1.4684 0.10902 0.011 *
## Residuals 24 81884 3411.9 0.89098
## Total 26 91904 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsSCM10, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.09160135 1.714251 1;17 0.010 0.0285
## 2 KG.SOT 0.08413800 1.194278 1;13 0.174 0.1740
## 3 NOT.SOT 0.07617784 1.484270 1;18 0.019 0.0285
physeqSCM2<- subset_samples(psCLR, Depth == "SCM" & filter == "2")
OTUsSCM2 <- data.frame(otu_table(physeqSCM2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSCM2),]
SCM2 <- prcomp(OTUsSCM2)
plot(SCM2, type="lines")
PCA_SCM2<-ggbiplot(SCM2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("SCM 0.2")
PCA_SCM2
#ggsave("PCA_SCM2.png")
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSCM2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSCM2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 6676 3337.9 1.1825 0.09707 0.153
## Residuals 22 62100 2822.7 0.90293
## Total 24 68776 1.00000
physeqMid<- subset_samples(psCLR, Depth == "Mid")
ordu = ordinate(physeqMid, "PCoA", "euclidean")
p<-plot_ordination(physeqMid, ordu, color="Station", shape = "filter")+theme_bw() +scale_color_manual(values=ap)+ geom_point(size=3)
p + theme(legend.position="bottom") +ggtitle("Euclidean dist with CLR, Mid")
Interesting …. Filter effect shows up in CLR/Euc
Split by filter…
physeqMid10<- subset_samples(psCLR, Depth == "Mid" & filter == "10")
OTUsMid10 <- data.frame(otu_table(physeqMid10))
meta <- metatable[row.names(metatable) %in% row.names(OTUsMid10),]
Mid10 <- prcomp(OTUsMid10)
plot(Mid10, type="lines")
PCA_Mid10<-ggbiplot(Mid10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Mid 10.0")
PCA_Mid10
ggsave("PCA_Mid10.png")
## Saving 8 x 6.5 in image
PERMANOVA by Region
adonis(vegdist(OTUsMid10, method = "euclidean") ~ Region_SKN, data = meta, permutations = 999)
##
## Call:
## adonis(formula = vegdist(OTUsMid10, method = "euclidean") ~ Region_SKN, data = meta, permutations = 999)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 4934 2466.8 1.2673 0.09926 0.079 .
## Residuals 23 44769 1946.5 0.90074
## Total 25 49703 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Not Significant!
physeqMid2<- subset_samples(psCLR, Depth == "Mid" & filter == "2")
OTUsMid2 <- data.frame(otu_table(physeqMid2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsMid2),]
Mid2 <- prcomp(OTUsMid2)
plot(Mid2, type="lines")
PCA_Mid2<-ggbiplot(Mid2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Mid 0.2")
PCA_Mid2
ggsave("PCA_Mid2.png")
## Saving 8 x 6.5 in image
PERMANOVA by Region
set.seed(1)
adonis(vegdist(OTUsMid2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsMid2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 6883 3441.3 1.6878 0.13302 0.001 ***
## Residuals 22 44857 2038.9 0.86698
## Total 24 51739 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsMid2, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.12483288 2.139583 1;15 0.009 0.0270
## 2 KG.SOT 0.11303007 1.401773 1;11 0.062 0.0620
## 3 NOT.SOT 0.07811164 1.525141 1;18 0.019 0.0285
North to South = diff KG to North = diff
physeqBottom<- subset_samples(psCLR, Depth == "Bottom")
ordu = ordinate(physeqBottom, "PCoA", "euclidean")
p<-plot_ordination(physeqBottom, ordu, color="Station", shape = "filter")+theme_bw() +scale_color_manual(values=ap)+ geom_point(size=3)
p + theme(legend.position="bottom") +ggtitle("Euclidean dist with CLR, Bottom")
No clustering by filter type.
For consistency, split by filter type for the NOT/SOT/KG biogeography:
physeqBottom10<- subset_samples(psCLR, Depth == "Bottom" & filter == "10")
OTUsBottom10 <- data.frame(otu_table(physeqBottom10))
meta <- metatable[row.names(metatable) %in% row.names(OTUsBottom10),]
Bottom10 <- prcomp(OTUsBottom10)
plot(Bottom10, type="lines")
PCA_Bottom10<-ggbiplot(Bottom10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Bottom 10.0")
PCA_Bottom10
PERMANOVA by Region
set.seed(1)
adonis(vegdist(OTUsBottom10, method = "euclidean") ~ Region_SKN , data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsBottom10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 5815 2907.6 1.2317 0.09309 0.068 .
## Residuals 24 56656 2360.7 0.90691
## Total 26 62471 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
not significant
physeqBottom2<- subset_samples(psCLR, Depth == "Bottom" & filter == "2")
OTUsBottom2 <- data.frame(otu_table(physeqBottom2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsBottom2),]
Bottom2 <- prcomp(OTUsBottom2)
plot(Bottom2, type="lines")
PCA_Bottom2<-ggbiplot(Bottom2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=colors)+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Bottom 0.2")
PCA_Bottom2
PERMANOVA by region
set.seed(1)
adonis(vegdist(OTUsBottom2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsBottom2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 7126 3563.2 1.512 0.1162 0.007 **
## Residuals 23 54200 2356.5 0.8838
## Total 25 61327 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsBottom2, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.06910372 1.187737 1;16 0.165 0.165
## 2 KG.SOT 0.09775075 1.408435 1;13 0.030 0.045
## 3 NOT.SOT 0.10568592 2.008982 1;17 0.001 0.003
For Vent v. Plume - use 10.0 and 0.2 samples together - needed bc low number of vent sites compared to non vent sites PERMANOVA
set.seed(1)
physeqBottom<- subset_samples(psCLR, Depth == "Bottom")
OTUsBottom <- data.frame(otu_table(physeqBottom))
meta <- metatable[row.names(metatable) %in% row.names(OTUsBottom),]
adonis(vegdist(OTUsBottom, method = "euclidean") ~ VentPlume, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsBottom, method = "euclidean") ~ VentPlume, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## VentPlume 2 9303 4651.6 1.9195 0.0713 0.001 ***
## Residuals 50 121170 2423.4 0.9287
## Total 52 130473 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsBottom, method = "euclidean"), group.var="VentPlume")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 n.p 0.03681769 1.6436771 1;43 0.012 0.018
## 2 n.y 0.04858874 2.4002983 1;47 0.001 0.003
## 3 p.y 0.08956238 0.9837289 1;10 0.432 0.432
Bottom <- prcomp(OTUsBottom)
plot(Bottom, type="lines")
PCA_Bottom<-ggbiplot(Bottom, var.axes = FALSE, groups= meta$VentPlume, ellipse=TRUE ) +theme_bw() +scale_color_manual(values= c(wes_palette("Cavalcanti1")[4], wes_palette("Cavalcanti1")[1], wes_palette("Cavalcanti1")[5] ))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Bottom")
PCA_Bottom
Prepare metadata table:
metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$desc <- paste(metatable$Station, metatable$Depth, metatable$filter, sep="-")
metatable <- metatable[metatable$SampleID %ni% mixedup,]
metatable$Depth <- as.character(metatable$Depth)
metatable$Station <-as.character(metatable$Station)
metatable$filter<-as.character(metatable$filter)
metatable<- metatable[,-which(names(metatable) == "SampleID")]
META<- sample_data(metatable)
ps<- subset_samples(ps, SampleID %ni% mixedup)
ps2<- ps
sample_data(ps2) <- META
print(ps2)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 19923 taxa and 211 samples ]
## sample_data() Sample Data: [ 211 samples by 11 sample variables ]
## tax_table() Taxonomy Table: [ 19923 taxa by 8 taxonomic ranks ]
str(sample_data(ps2))
## 'data.frame': 211 obs. of 11 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 11
## .. ..$ : chr "11" "10" "10" "10" ...
## .. ..$ : chr "Mid" "Bottom" "Mid" "SCM" ...
## .. ..$ : int 700 1510 700 85 85 85 700 700 1510 1510 ...
## .. ..$ : chr "2" "2" "10" "2" ...
## .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : Factor w/ 4 levels "KG","MOT","NOT",..: 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : Factor w/ 3 levels "KG","NOT","SOT": 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : Factor w/ 3 levels "n","p","y": 2 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : Factor w/ 4 levels "KG","NOT","O",..: 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : Factor w/ 2 levels "n","y": 1 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : chr "11-Mid-2" "10-Bottom-2" "10-Mid-10" "10-SCM-2" ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "F142" "F150" "F151" "F154" ...
## ..@ .S3Class : chr "data.frame"
Merge:
mergedps <- merge_samples(ps2, "desc")
print(mergedps)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 19923 taxa and 112 samples ]
## sample_data() Sample Data: [ 112 samples by 11 sample variables ]
## tax_table() Taxonomy Table: [ 19923 taxa by 8 taxonomic ranks ]
number of samples reduced from 211 to 112
sample_names(mergedps)
## [1] "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2"
## [5] "10-SCM-10" "10-SCM-2" "10-Surface-10" "10-Surface-2"
## [9] "11-Bottom-10" "11-Bottom-2" "11-Mid-10" "11-Mid-2"
## [13] "11-SCM-10" "11-SCM-2" "11-Surface-10" "11-Surface-2"
## [17] "12-Bottom-10" "12-Bottom-2" "12-Mid-10" "12-Mid-2"
## [21] "12-SCM-10" "12-SCM-2" "12-Surface-10" "12-Surface-2"
## [25] "13-Bottom-10" "13-Bottom-2" "13-Mid-10" "13-Mid-2"
## [29] "13-SCM-10" "13-SCM-2" "13-Surface-10" "13-Surface-2"
## [33] "14-Bottom-10" "14-Bottom-2" "14-Mid-10" "14-Mid-2"
## [37] "14-SCM-10" "14-SCM-2" "14-Surface-10" "14-Surface-2"
## [41] "15-Bottom-10" "15-Bottom-2" "15-Mid-10" "15-Mid-2"
## [45] "15-SCM-10" "15-SCM-2" "15-Surface-10" "15-Surface-2"
## [49] "17-Bottom-10" "17-Bottom-2" "17-Mid-10" "17-Mid-2"
## [53] "17-SCM-10" "17-SCM-2" "17-Surface-10" "17-Surface-2"
## [57] "18-Bottom-10" "18-Bottom-2" "18-Mid-10" "18-Mid-2"
## [61] "18-SCM-10" "18-SCM-2" "18-Surface-10" "18-Surface-2"
## [65] "2-Bottom-10" "2-Bottom-2" "2-Mid-10" "2-Mid-2"
## [69] "2-SCM-10" "2-SCM-2" "2-Surface-10" "2-Surface-2"
## [73] "3-Bottom-10" "3-Bottom-2" "3-Mid-10" "3-Mid-2"
## [77] "3-SCM-10" "3-SCM-2" "3-Surface-10" "3-Surface-2"
## [81] "4-Bottom-10" "4-Bottom-2" "4-Mid-10" "4-Mid-2"
## [85] "4-SCM-10" "4-SCM-2" "4-Surface-10" "4-Surface-2"
## [89] "5-Bottom-10" "5-Bottom-2" "5-Mid-10" "5-Mid-2"
## [93] "5-SCM-10" "5-SCM-2" "5-Surface-10" "5-Surface-2"
## [97] "8-Bottom-10" "8-Bottom-2" "8-Mid-10" "8-Mid-2"
## [101] "8-SCM-10" "8-SCM-2" "8-Surface-10" "8-Surface-2"
## [105] "9-Bottom-10" "9-Bottom-2" "9-Mid-10" "9-Mid-2"
## [109] "9-SCM-10" "9-SCM-2" "9-Surface-10" "9-Surface-2"
But metatable inside phyloseq object was disturbed …
str(sample_data(mergedps))
## 'data.frame': 112 obs. of 11 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 11
## .. ..$ : num 10 10 10 10 10 10 10 10 11 11 ...
## .. ..$ : num NA NA NA NA NA NA NA NA NA NA ...
## .. ..$ : num 1510 1510 700 700 85 85 0 0 1210 1210 ...
## .. ..$ : num 10 2 10 2 10 2 10 2 10 2 ...
## .. ..$ : num 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 2 2 2 2 2 2 2 2 1 1 ...
## .. ..$ : num NA NA NA NA NA NA NA NA NA NA ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ .S3Class : chr "data.frame"
Fix meta table in phyloseq object:
meta2<-as.data.frame(sample_data(mergedps))
split<- do.call(rbind, strsplit(row.names(meta2), "-"))
meta2$Depth <- split[,2]
meta2$filter<-split[,3]
meta2$desc <- row.names(meta2)
meta2$Station <- as.character(meta2$Station)
meta2$Depth_f <- factor(meta2$Depth, levels=c('Surface','SCM','Mid','Bottom'))
meta2$Station_f <- factor(meta2$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))
META <-sample_data(meta2)
sample_data(mergedps)<-META
Fixed!
str(sample_data(mergedps))
## 'data.frame': 112 obs. of 13 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 13
## .. ..$ : chr "10" "10" "10" "10" ...
## .. ..$ : chr "Bottom" "Bottom" "Mid" "Mid" ...
## .. ..$ : num 1510 1510 700 700 85 85 0 0 1210 1210 ...
## .. ..$ : chr "10" "2" "10" "2" ...
## .. ..$ : num 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 2 2 2 2 2 2 2 2 1 1 ...
## .. ..$ : chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## .. ..$ : Factor w/ 4 levels "Surface","SCM",..: 4 4 3 3 2 2 1 1 4 4 ...
## .. ..$ : Factor w/ 14 levels "11","10","9",..: 2 2 2 2 2 2 2 2 1 1 ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ .S3Class : chr "data.frame"
Taxa glom - for clean RA plots
mergedps<- subset_taxa(mergedps, D2 != "Metazoa")
mergedps<-subset_taxa(mergedps, D1 != "")
mergedps<-subset_taxa(mergedps, D1 != "NA")
mergedps = tax_glom(mergedps, "D2")
LowPrev<- c("Alveolata_X", "Amoebozoa_X", "Apicomplexa", "Apusomonadidae", "Conosa", "Discoba", "Eukaryota_XX", "Foraminifera", "Hilomonadea","Fungi", "Katablepharidophyta", "Lobosa", "Mesomycetozoa", "Metamonada", "Metazoa", "Opisthokonta_X", "Perkinsea", "Streptophyta", "Rhodophyta","Telonemia", "")
mergedHighPrev<- subset_taxa(mergedps, D2 %ni% LowPrev)
mergedRAhp <- transform_sample_counts(mergedHighPrev, function(OTU) 100* OTU/sum(OTU))
ap<- c("#F1B6A1", "#F58851", "#E3E5DB", "#A5CFCC", "#11638C", "#A83860","#D4A52A", "#CF529C", "#1E479A", "#F7CDA4", "#42465C", "#0E899F", "#DB5339")
taxabarplot<-plot_bar(mergedRAhp, x= "Station_f", fill = "D2", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values= ap) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
taxaplot_nolegend <- taxabarplot + theme(legend.position="none")
taxaplot_nolegend
Dj1<- wes_palette("Darjeeling1")
Dj<- wes_palette("Darjeeling2")
Cv <- wes_palette("Cavalcanti1")
Gb<- wes_palette("GrandBudapest1")
Gb2<- wes_palette("GrandBudapest2")
Br<- wes_palette("BottleRocket2")
Rm<- wes_palette("Rushmore1")
R1 <- wes_palette("Royal1")
R2<-wes_palette("Royal2")
mr3<- wes_palette("Moonrise3")
C1<- wes_palette("Chevalier1")
wpcolor<- c( Dj1[1:4], "blue", Dj[1:4], R2[3],Cv[1:3],Br[1:4], Rm[2:4], Gb2,mr3,Gb[2:4], "white" )
wpcolor[17]<- "forestgreen"
wpcolor[26]<-"grey"
wpcolor[21]<-"black"
mergedRA <- transform_sample_counts(mergedps, function(OTU) 100* OTU/sum(OTU))
taxabarplot<-plot_bar(mergedRA, x= "Station_f", fill = "D2", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=wpcolor) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
taxaplot_nolegend <- taxabarplot + theme(legend.position="none")
taxaplot_nolegend
ggsave("taxaplot_nolegend.pdf", width = 8, height = 5)
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
legend <- g_legend(taxabarplot)
plegend <- grid.arrange(legend)
wpcolor2<- c(Br[1:3], Rm[3:5], R1[1:2], C1[1], C1[3:4])
taxabarplotD1<-plot_bar(mergedRA, x= "Station_f", fill = "D1", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=wpcolor2) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D1), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
taxaplotD1_nolegend <- taxabarplotD1 + theme(legend.position="none")
taxaplotD1_nolegend
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
legend <- g_legend(taxabarplotD1)
D1legend <- grid.arrange(legend)
BottomTaxa<-subset_samples(mergedRA, Depth == "Bottom")
Bottombarplot<-plot_bar(BottomTaxa, x= "Station_f", fill = "D2", facet_grid=~filter) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=wpcolor) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
Bottombarplot_nolegend <- Bottombarplot+ theme(legend.position="none")
Bottombarplot_nolegend
ggsave("Bottombarplot_nolegend.pdf", width=8, height =4)
subset rawcount phyloseq object to just hold bottom samples
bottomraw<- subset_samples(ps, Depth=="Bottom")
First, only station 2 and station 10 are considered vents
ddse <- phyloseq_to_deseq2(bottomraw, ~VentANAHTM)
## converting counts to integer mode
ddse2 <- DESeq(ddse, test="Wald", fitType="parametric")
res<- results(ddse2, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
head(sigtab)
## log2 fold change (MLE): VentANAHTM y vs n
## Wald test p-value: VentANAHTM y vs n
## DataFrame with 6 rows and 6 columns
## baseMean log2FoldChange
## <numeric> <numeric>
## 0595b46cbc5ccfb36ee584aa4bba1ab2 1.47374676747638 23.0718511423123
## 062eda4af9bf284c4f90a12a56c70528 15.0330428805362 -20.3280575624619
## 1999b708a0fc267d4b2689f846c8fc5d 43.4377717005511 -22.7039460856944
## 264bcfe948e3249d0351741f0ff2de89 17.1218935691345 -22.1527615190764
## 2d606b975a788dd2e2d30370526924cb 11.8913737552474 -22.1583793820037
## 4147786eeaf2d8e0def7eaff9e82ac85 9.92472807723299 25.7320482337882
## lfcSE stat
## <numeric> <numeric>
## 0595b46cbc5ccfb36ee584aa4bba1ab2 4.04436482692962 5.70469087968701
## 062eda4af9bf284c4f90a12a56c70528 2.90826082281126 -6.98976426151895
## 1999b708a0fc267d4b2689f846c8fc5d 4.09406649683235 -5.54557335677397
## 264bcfe948e3249d0351741f0ff2de89 2.84279000952778 -7.79261269556672
## 2d606b975a788dd2e2d30370526924cb 3.17412268665862 -6.98094609737021
## 4147786eeaf2d8e0def7eaff9e82ac85 4.04178874952267 6.36650003957458
## pvalue padj
## <numeric> <numeric>
## 0595b46cbc5ccfb36ee584aa4bba1ab2 1.16554415268086e-08 2.28228114396821e-06
## 062eda4af9bf284c4f90a12a56c70528 2.75348510517914e-12 1.02065760083012e-09
## 1999b708a0fc267d4b2689f846c8fc5d 2.92992462963282e-08 5.39967874390567e-06
## 264bcfe948e3249d0351741f0ff2de89 6.56375825159134e-15 3.42737576703928e-12
## 2d606b975a788dd2e2d30370526924cb 2.93198800110791e-12 1.02065760083012e-09
## 4147786eeaf2d8e0def7eaff9e82ac85 1.93390250474416e-10 4.66070503643343e-08
dim(sigtab)
## [1] 24 6
sigtab = cbind(as(sigtab, "data.frame"), as(tax_table(bottomraw)[rownames(sigtab), ], "matrix"))
AND if station 11 is a vent … First, only station 2 and station 10 are considered vents
ddse <- phyloseq_to_deseq2(bottomraw, ~Vent)
## converting counts to integer mode
ddse2 <- DESeq(ddse, test="Wald", fitType="parametric")
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 3726 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
res<- results(ddse2, cooksCutoff = FALSE)
alpha = 0.05
sigtab = res[which(res$padj < alpha), ]
dim(sigtab)
## [1] 70 6
dim(sigtab[sigtab$log2FoldChange >0,])
sigtab = cbind(as(sigtab, "data.frame"), as(tax_table(bottomraw)[rownames(sigtab), ], "matrix"))
#sigtab$log2FoldChange <- sigtab$log2FoldChange * -1
x = tapply(sigtab$log2FoldChange, sigtab$D2, function(x) max(x))
x = sort(x, TRUE)
sigtab$D2 = factor(as.character(sigtab$D2), levels=names(x))
#D3 level
x = tapply(sigtab$log2FoldChange, sigtab$D3, function(x) max(x))
x = sort(x, TRUE)
sigtab$D3 = factor(as.character(sigtab$D3), levels=names(x))
# D4 level
x = tapply(sigtab$log2FoldChange, sigtab$D4, function(x) max(x))
x = sort(x, TRUE)
sigtab$D4 = factor(as.character(sigtab$D4), levels=names(x))
#plot F0
sigplot<- ggplot(sigtab, aes(x=D2, y=log2FoldChange, color = D3)) + geom_point(size=5, alpha = 0.7) + theme(legend.title=element_blank()) + theme_bw() +
ggtitle(" ") +
theme(axis.text.x = element_text(angle = -45, hjust = 0, vjust=0.75)) + scale_color_manual(values=colors) + theme(text = element_text(size=16))
noLegendsigplot <- sigplot +theme(legend.position = "none")
noLegendsigplot
## Warning: Removed 1 rows containing missing values (geom_point).
#ggsave("sigplotnoLegend.png", height = 4, width = 6)
sigplot
## Warning: Removed 1 rows containing missing values (geom_point).
#ggsave("sigplot.png")